Abstract
Visual illusions highlight how the brain uses contextual and prior information to inform our perception of reality. Unfortunately, illusion research has been hampered by the difficulty of adapting these stimuli to experimental settings. In this set of studies, we used the parametric framework for visual illusions implemented in the Pyllusion software to generate 10 different classic illusions (Delboeuf, Ebbinghaus, Rod and Frame, Vertical-Horizontal, Zöllner, White, Müller-Lyer, Ponzo, Poggendorff, Contrast) varying in strength. We tested the objective effect of the illusions on errors and reaction times in a perceptual discrimination task, from which we extracted participant-level performance scores (n=250). Our results provide evidence in favour of the existence of a general factor (labelled Factor i) underlying the sensitivity to different illusions. Moreover, we report positive relationships between illusion sensitivity and personality traits such as Agreeableness and Honesty-Humility, and negative relationships with Psychoticism, Antagonism, Disinhibition, and Negative Affect.
Visual illusions are fascinating stimuli capturing a key feature of our neurocognitive systems. They eloquently show that our brains did not evolve to be perfect perceptual devices providing veridical accounts of physical reality, but integrate prior knowledge and contextual information - blended together in our subjective conscious experience (Carbon, 2014). Despite the longstanding interest within the fields of visual perception (Day, 1972; Eagleman, 2001; Gomez-Villa et al., 2022), consciousness science (Caporuscio et al., 2022; Lamme, 2020), and psychiatry (Gori et al., 2016; Notredame et al., 2014; Razeghi et al., 2022; Teufel et al., 2015), several important issues remain open.
Notably, the presence of a common mechanism underlying the effect of different illusions has been contested (Cretenoud et al., 2019; Cretenoud, Francis, et al., 2020; Hamburger, 2016), and the nature of the underlying processes, whether related to low-level features of the visual processing system (Cretenoud et al., 2019; Gori et al., 2016) or to top-down influences of prior beliefs (Caporuscio et al., 2022; Teufel et al., 2018), is strongly debated. The existence of dispositional correlates of illusion sensitivity is another area of controversy, with some studies reporting higher illusion resistance in patients with schizophrenia and autism (Giaouri & Alevriadou, 2011; Keane et al., 2014; Notredame et al., 2014; Park et al., 2022; Pessoa et al., 2008) and in individuals with stronger aggression and narcissism traits (Konrath et al., 2009; Zhang et al., 2017).
One key challenge hindering the further development of illusion research is the relative difficulty of adapting visual illusions to an experimental setting, which typically requires the controlled modulation of the specific variables of interest. To address this issue, we first developed a parametric framework to manipulate visual illusions, which we implemented and made accessible in the open-source software Pyllusion (Makowski et al., 2021). This software allows us to generate different types of classic visual illusions with a continuous and independent modulation of two parameters: illusion strength and task difficulty (see Figure 1).
knitr::include_graphics("figures/Figure1.png")
Figure 1. The parametric framework for visual illusions (Makowski et al., 2021) applied to the Müller-Lyer illusion (above). Below are examples of stimuli showcasing the manipulation of two parameters, task difficulty and illusion strength.
Indeed, many visual illusions can be seen as being composed of targets (e.g., same-length lines) whose perception is biased by the context (e.g., in the Müller-Lyer illusion, the same-length line segments appear to have different lengths when they end with inwards or outwards pointing arrows). Past illusion studies traditionally employed paradigms focusing on participants’ subjective experience, by asking them the extent to which they perceive two identical targets as different (Lányi et al., 2022), or having them adjust the targets to match a reference stimulus relying only on their perception (Grzeczkowski et al., 2018; Mylniec & Bednarek, 2016). In contrast, Pyllusion allows the creation of illusions in which the targets are objectively different (e.g., one segment is truly longer than the other, by a larger or smaller margin), and in which the illusion varies in strength (the biasing angle of the arrows is more or less acute).
This opens the door for an experimental task in which participants make perceptual judgments about the targets (e.g., which segment is longer) under different conditions of objective difficulty and illusion strength. Moreover, the illusion effect can be either “incongruent” (making the task more difficult by biasing the perception in the opposite way) or “congruent” (making the task easier). Although visual illusions are inherently tied to subjective perception, this framework allows a reversal of the traditional paradigm to potentially quantify the “objective” effect of illusions by measuring their behavioral effect (error rate and reaction times) on the performance in a perceptual task.
In the present set of preregistered studies, we will first test this novel paradigm by investigating if the effect of illusion strength and task difficulty can be manipulated continuously, and separately modeled statistically. Then, we will further utilize the paradigm to assess whether 10 different classic illusions (Delboeuf, Ebbinghaus, Rod and Frame, Vertical-Horizontal, Zöllner, White, Müller-Lyer, Ponzo, Poggendorff, Contrast) share a common latent factor. Finally, we will investigate how the inter-individual sensitivity to illusions relates to dispositional variables, such as demographic characteristics and personality.
In line with open-science standards, all the material (stimuli generation code, experiment code, raw data, analysis script with complementary figures and analyses, preregistration, etc.) is available at https://github.com/RealityBending/IllusionGameValidation.
Study 1 can be seen as a pilot experiment aiming to gather preliminary data to assess whether the stimuli generated by Pyllusion behave as expected for each of the 10 illusion types (i.e., whether an increase of task difficulty and illusion strength leads to an increase of errors), to develop an intuition about the magnitude of effects, and to refine the stimuli parameters to a more sensible range (i.e., not overly easy and not impossibly hard) for the next study.
We generated 56 stimuli for each of the 10 illusion types. These stimuli resulted from the combination of 8 linearly-spread levels of task difficulty (e.g., [1, 2, 3, 4, 5, 6, 7, 8], where 1 corresponds to the highest difficulty - i.e., the smallest objective difference between targets) and 7 levels of illusion strength (3 values of strength on the congruent side, 3 on the incongruent side, and 0; e.g., [-3, -2, -1, 0, 1, 2, 3], where negative values correspond to congruent illusion strengths).
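The crossing of difficulty and strength levels can be sketched as a Cartesian product. This is a minimal Python illustration rather than the study's generation code: the integer levels are placeholders for the actual Pyllusion parameter values, and the variable names are hypothetical.

```python
from itertools import product

# Placeholder levels (assumption): 8 difficulty levels and 7 illusion strength levels
difficulty_levels = list(range(1, 9))  # 1 = smallest objective difference (hardest)
strength_levels = list(range(-3, 4))   # negative = congruent, positive = incongruent

# Cross every difficulty level with every illusion strength level
stimuli = [
    {"difficulty": d, "strength": s}
    for d, s in product(difficulty_levels, strength_levels)
]

print(len(stimuli))  # prints 56
```

Crossing 8 difficulty levels with 7 strength levels gives the 56 stimuli per illusion type.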
The 10 illusion blocks were presented in random order, and the order of the 56 stimuli within the blocks was also randomized. After the first series of 10 blocks, another series was administered (with new randomized orders of blocks and trials). In total, each participant saw 56 different trials for each of the 10 illusion types, repeated 2 times (total = 1120 trials), to which they had to respond “as fast as possible without making errors” (i.e., an explicit double constraint to mitigate the inter-individual variability in the speed-accuracy trade-off). The task was implemented using jsPsych (De Leeuw, 2015), and the instructions for each illusion type are available in the experiment code.
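The block and trial structure can be illustrated with a short Python sketch. It only mirrors the design described above and is not the jsPsych experiment code; the function name and the seed are hypothetical.

```python
import random

ILLUSIONS = [
    "Delboeuf", "Ebbinghaus", "Rod and Frame", "Vertical-Horizontal", "Zöllner",
    "White", "Müller-Lyer", "Ponzo", "Poggendorff", "Contrast",
]
N_STIMULI = 56  # stimuli per illusion block
N_SERIES = 2    # the 10 blocks are administered twice

def build_session(seed=None):
    """Return a list of (series, illusion, stimulus_index) trials."""
    rng = random.Random(seed)
    trials = []
    for series in range(1, N_SERIES + 1):
        blocks = ILLUSIONS[:]           # new random block order for each series
        rng.shuffle(blocks)
        for illusion in blocks:
            order = list(range(N_STIMULI))
            rng.shuffle(order)          # new random trial order within each block
            trials.extend((series, illusion, idx) for idx in order)
    return trials

session = build_session(seed=1)
print(len(session))  # prints 1120
```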
Fifty-two participants were recruited via Prolific (www.prolificacademic.co.uk), a crowd-sourcing platform providing high data quality (Peer et al., 2022). The only inclusion criterion was a fluent proficiency in English to ensure that the task instructions would be well-understood. Participants were incentivised with a reward of about for completing the task, which took about 50 minutes to finish.
We removed 6 participants upon inspection of their average error rate (when close to 50%, suggesting random answers) or of their reaction time distribution (when implausibly fast). For the remaining participants, we discarded blocks where the error rate was higher than 50% (possibly indicating that instructions were misunderstood; e.g., participants were selecting the shorter line instead of the longer one). Finally, we removed 692 (1.37%) trials based on an implausibly short or long response time (< 150 ms or > 3000 ms).
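The block-level and trial-level exclusion rules can be sketched as follows. This is an illustrative Python version under stated assumptions: the data layout, field names and toy values are hypothetical, and computing the block error rate before filtering response times is an assumed order of operations. The participant-level exclusion, which relied on inspection, is not covered.

```python
from collections import defaultdict

RT_MIN_MS, RT_MAX_MS = 150, 3000  # response time bounds reported in the text
MAX_BLOCK_ERROR_RATE = 0.5        # blocks above this error rate are discarded

def clean_trials(trials):
    """Drop blocks with more than 50% errors, then trials with extreme RTs."""
    by_block = defaultdict(list)
    for t in trials:
        by_block[(t["participant"], t["block"])].append(t)

    kept = []
    for block_trials in by_block.values():
        error_rate = sum(t["error"] for t in block_trials) / len(block_trials)
        if error_rate > MAX_BLOCK_ERROR_RATE:
            continue  # possibly misunderstood instructions: discard the whole block
        kept.extend(t for t in block_trials if RT_MIN_MS <= t["rt"] <= RT_MAX_MS)
    return kept

# Toy data (hypothetical values, for illustration only)
toy = [
    {"participant": "p1", "block": "Ponzo", "rt": 640, "error": False},
    {"participant": "p1", "block": "Ponzo", "rt": 90, "error": False},   # too fast
    {"participant": "p1", "block": "Ponzo", "rt": 3500, "error": True},  # too slow
    {"participant": "p1", "block": "White", "rt": 700, "error": True},
    {"participant": "p1", "block": "White", "rt": 800, "error": True},   # block at 100% errors
]
print(len(clean_trials(toy)))  # prints 1
```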
The final sample included 46 participants (Mean age = 26.7, SD = 7.7, range: [19, 60]; Sex: 39.1% females, 56.5% males, and 4.4% other).
The analysis of study 1 focused on the probability of errors as the main outcome variable. For each illusion, we started by visualizing the average effect of task difficulty and illusion strength to gain some intuition on the underlying generative model. Next, we tested the performance of various logistic models differing in their specifications, such as: with or without a transformation of the task difficulty (log, square root or cubic root), with or without a 2nd order polynomial term for the illusion strength, and with or without the illusion side (up vs. down or left vs. right) as an additional predictor. We then fitted the best performing model under a Bayesian framework, and compared its visualization with that of a Generalized Additive Model (GAM), which is better able to capture potential non-linear relationships (at the expense of model simplicity).
The analysis was carried out using R 4.2 (R Core Team, 2022), brms (Bürkner, 2017), the tidyverse (Wickham et al., 2019), and the easystats collection of packages (Lüdecke et al., 2021, 2019; Makowski et al., 2020; Makowski, Ben-Shachar, & Lüdecke, 2019).
The statistical models suggested that the effect of task difficulty had a cubic relationship with error rate for the Delboeuf and Ebbinghaus illusions (both composed of circular shapes), square relationship for the Rod and Frame and Vertical-Horizontal illusions, cubic relationship for the Zöllner and Poggendorff illusions, exponential relationship for the White illusion, cubic relationship for the Müller-Lyer and Ponzo illusions (both based on line lengths), and linear relationship for the Contrast illusion. All models suggested a significant effect of illusion strength and task difficulty. See details and figures in the analysis script.
This study provided a clearer understanding of the magnitude of the parametric effects at stake and the type of interaction between them. Furthermore, it allowed us to better understand and test the stimuli generated by Pyllusion, as well as uncover incidental bugs and technical issues (for instance, the specification direction of the illusion strength was reversed for a few illusions), which were fixed in a new software release. Crucially, this study allowed us to refine the range of task difficulty and illusion strength values in order to maximize information gain.
In most illusions, the task difficulty exhibited monotonic power-law scaled effects, which is in line with the psychophysics literature on perceptual decisions (Bogacz et al., 2006; Ditzinger, 2010; Shekhar & Rahnev, 2021). One notable result was the illusion effect pattern for the Zöllner illusion, which suggested a non-linear relationship. By generating a wider range of illusion strength values, the next study will attempt to clarify this point.
The aim of study 2 was two-fold. In the first part, we carefully modeled the error rate and the reaction time of each illusion type in order to validate our novel paradigm and show that the effect of illusions can be manipulated continuously. In the second part, we derived the participant-level scores from the models (i.e., the effect of illusion strength for each individual) and analyzed their latent factor structure.
The paradigm of study 2 was similar to that of study 1, with the following changes: the illusory stimuli were re-generated within a refined space of parameters based on the results of study 1. Moreover, taking into account the findings of study 1, we used non-linearly spaced difficulty levels, depending on the best underlying model (i.e., with an exponential, square or cubic spacing depending on the relationship). For instance, a linear space of [0.1, 0.4, 0.7, 1.0] can be transformed to an exponential space of [0.1, 0.34, 0.64, 1.0].
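The numerical example above can be reproduced with a short Python sketch. The transformation used here (equal steps on a log(1 + x) scale, mapped back with exp(x) - 1) is one that yields the reported values; whether it is the exact transformation used to generate the stimuli is an assumption, and the function name is hypothetical.

```python
import math

def exponential_space(start, stop, n):
    """n values from start to stop, equally spaced on a log(1 + x) scale."""
    lo, hi = math.log1p(start), math.log1p(stop)
    step = (hi - lo) / (n - 1)
    return [math.expm1(lo + i * step) for i in range(n)]

levels = exponential_space(0.1, 1.0, 4)
print([round(x, 2) for x in levels])  # prints [0.1, 0.34, 0.64, 1.0]
```

Compared with the linear space, the intermediate values are pulled towards the lower end of the range, so more stimuli are sampled where the objective difference between targets is small.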
Additionally, instead of repeating each stimulus two times, we generated illusions using more levels of difficulty and illusion strength. As such, for each illusion type, we generated a total of 134 stimuli that were split into two groups (67 stimuli per illusion block). Furthermore, instead of a simple break screen, we added two personality questionnaires between the two series of 10 illusion blocks (see study 3).
Using the same recruitment procedure as in study 1, we recruited 256 participants, out of which 6 were identified as outliers and excluded, leaving a final sample of 250 participants (Mean age = 26.5, SD = 7.6, range: [18, 69]; Sex: 48% females, 52% males). Please see study 3 for the full demographic breakdown. We discarded blocks with more than 50% of errors (2.16% of trials) and 0.76% of trials with extreme response times (< 125 ms or > 4 SD above mean).
The first part of the analysis focused on modelling the effect of illusion strength and task difficulty on errors and reaction time (RT) within each illusion. In order to achieve this, we started by fitting Generalized Additive Models (GAMs), which can accommodate possible non-linear effects and interactions. Errors were analyzed using Bayesian logistic mixed models, and RTs of correct responses were analyzed using an ex-Gaussian family with the same fixed effects entered for the location \(\mu\) (mean), scale \(\sigma\) (spread) and tail-dominance \(\tau\) of the RT distribution (Balota & Yap, 2011; Matzke & Wagenmakers, 2009).
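To give an intuition for the ex-Gaussian family, the sketch below simulates RTs as the sum of a Gaussian component and an exponential component. It uses the textbook parameterization, in which `mu` and `sigma` describe the Gaussian part and `tau` is the mean of the exponential tail; some modelling software instead defines the location as the mean of the whole distribution. The parameter values are arbitrary and the function name is hypothetical.

```python
import random
import statistics

def sample_exgaussian(mu, sigma, tau, n, seed=0):
    """Draw n ex-Gaussian samples: Normal(mu, sigma) + Exponential(mean = tau)."""
    rng = random.Random(seed)
    # expovariate takes a rate parameter, i.e. 1 / mean
    return [rng.gauss(mu, sigma) + rng.expovariate(1.0 / tau) for _ in range(n)]

# Arbitrary illustrative values, in seconds
rts = sample_exgaussian(mu=0.45, sigma=0.05, tau=0.20, n=20000)

# The theoretical mean is mu + tau, and the exponential tail makes the
# distribution right-skewed, so the mean exceeds the median.
print(statistics.mean(rts), statistics.median(rts))
```

A larger `tau` lengthens the right tail of slow responses without moving the Gaussian component. Modelling the three parameters separately therefore allows predictors to affect the bulk and the tail of the RT distribution differently.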
Using GAMs as the “ground-truth” models, we attempted to approximate them using general linear models, which have the advantage of estimating the participant-level variability of the effects (via random slopes). Following a comparison of models with a combination of transformations (raw, log, square root or cubic root) on the main predictors (task difficulty and illusion strength), we selected and fitted the best model for each illusion (based on indices of fit), and compared its output visually with that of the GAM (see Figure 2).
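The space of candidate models can be sketched by enumerating the transformations applied to each predictor. The Python snippet below only builds formula-like strings. Fully crossing the four transformations for both predictors is an assumption about how the comparison was set up, and the names `diff` and `strength` are shorthand for task difficulty and illusion strength.

```python
from itertools import product

TRANSFORMS = ["raw", "log", "sqrt", "cbrt"]

def term(transform, variable):
    """Format a predictor with its transformation, e.g. 'sqrt(diff)'."""
    return variable if transform == "raw" else f"{transform}({variable})"

# One candidate per pair of transformations, each including the interaction
candidates = [
    f"{term(t_diff, 'diff')} * {term(t_strength, 'strength')}"
    for t_diff, t_strength in product(TRANSFORMS, TRANSFORMS)
]

print(len(candidates))  # prints 16
```

Each candidate would then be fitted separately for each illusion and outcome, and the best one retained based on indices of fit.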
We then extracted the inter-individual variability in the effect of illusion strength and its interaction with task difficulty, and used it as participant-level scores. Finally, we explored the relationship of these indices across different illusions using exploratory factor analysis (EFA) and structural equation modelling (SEM).
The best models were \(log(diff)*strength\) for Delboeuf; \(sqrt(diff)*strength\) for Ebbinghaus; \(log(diff)*log(strength)\) for Rod and Frame; \(sqrt(diff)*sqrt(strength)\) for Vertical-Horizontal; \(cbrt(diff)*strength\) for Zöllner; \(diff*sqrt(strength)\) and \(log(diff)*strength\) respectively for errors and RT in White; \(sqrt(diff)*sqrt(strength)\) and \(sqrt(diff)*strength\) respectively for errors and RT in Müller-Lyer; \(cbrt(diff)*strength\) for Ponzo; \(cbrt(diff)*sqrt(strength)\) and \(cbrt(diff)*strength\) respectively for errors and RT in Poggendorff; and \(sqrt(diff)*sqrt(strength)\) for Contrast. For all of these models, the effects of illusion strength, task difficulty and their interaction were significant.
For error rates, most of the models closely matched their GAM counterparts (see Figure 2), with the exception of Delboeuf (for which the GAM suggested a non-monotonic effect of illusion strength with a local minimum at 0) and Zöllner (for which theoretically congruent illusion effects were related to increased error rate).
knitr::include_graphics("figures/Figure2.png")